Cross-Sectional Study of Clinical Predictors of Coccidioidomycosis, Arizona, USA

Demographic and clinical indicators have been described to support identification of coccidioidomycosis; however, the interplay of these conditions has not been explored in a clinical setting. In 2019, we enrolled 392 participants in a cross-sectional study for suspected coccidioidomycosis in emergency departments and inpatient units in Coccidioides-endemic regions. We aimed to develop a predictive model among participants with suspected coccidioidomycosis. We applied a least absolute shrinkage and selection operator to specific coccidioidomycosis predictors and developed univariable and multivariable logistic regression models. Univariable models identified elevated eosinophil count as a statistically significant predictive feature of coccidioidomycosis in both inpatient and outpatient settings. Our multivariable outpatient model also identified rash (adjusted odds ratio 9.74 [95% CI 1.03–92.24]; p = 0.047) as a predictor. Our results suggest preliminary support for developing a coccidioidomycosis prediction model for use in clinical settings.

Coccidioidomycosis, colloquially known as cocci or Valley fever, is a fungal infection endemic to the southwestern United States and parts of Central and South America (1). Infection occurs through inhalation of an arthroconidium from the dimorphic, soil-dwelling fungi Coccidioides immitis and C. posadasii. Incidence has increased since 1995, when coccidioidomycosis became a reportable infection (2). During 2016-2018, the Centers for Disease Control and Prevention reported a 32% increase in coccidioidomycosis cases (3). Epidemiologic studies suggest climate change, more frequent soilborne dust exposures, and a growing population of older adults in endemic regions as possible causes for increased coccidioidomycosis rates (4). Despite enhanced surveillance efforts, coccidioidomycosis incidence is underreported (4,5), and estimates suggest ≥150,000 infections annually in the United States (6).
Because of limited ability to prevent Coccidioides exposure in the community and no existing vaccine, coccidioidomycosis poses a substantial burden to patients and healthcare systems in endemic areas (7,8). Most (60%) Coccidioides infections are subclinical, but clinical cases produce protracted respiratory conditions (9,10). Observational studies indicate that 15%-29% of community-acquired pneumonia in endemic areas is caused by coccidioidomycosis (11,12). Diverse and nonspecific manifestations including fatigue, cough, fever, and rash make diagnosis challenging, and coccidioidomycosis can easily be mistaken for other respiratory illnesses, eczema, or bacterial pneumonia. Thus, misdiagnosis and inappropriate treatments are common, and up to 81% of patients are prescribed an antibacterial drug (5,12). However, few studies have investigated factors associated with increased coccidioidomycosis incidence to support clinical decision-making (13).
Increased incidence and complex clinical manifestations of coccidioidomycosis emphasize the need to improve disease identification in clinical settings. In 2019, we prospectively enrolled participants with suspected coccidioidomycosis to evaluate a novel diagnostic test (14). For this study, we used data from our prior study to develop a coccidioidomycosis prediction model based on demographic, clinical, and laboratory factors. We developed independent models for outpatient and inpatient settings.

Design
During January-December 2019, we collected data from a prospective study that enrolled participants at 2 academic medical centers in southern Arizona, Banner-University Medical Center Tucson and Banner-University Medical Center Phoenix, and their affiliated outpatient clinics. During that study, we enrolled 392 participants with suspected coccidioidomycosis, which was defined by clinician orders for coccidioidomycosis serologic testing (14). Our protocol was consistent with public health recommendations to test for coccidioidomycosis among patients with pneumonia-like symptoms in endemic areas. Patients with alternative clinical manifestations, such as fibrocavitary or disseminated disease, were also evaluated for coccidioidomycosis. Research coordinators were alerted to potential participants either via the electronic medical record (Cerner, https://www.cerner.com) when clinicians ordered a coccidioidomycosis screening test or directly by outpatient clinicians (15). We excluded persons <18 years of age or with a history of coccidioidomycosis. Consenting participants completed a medical questionnaire and provided an additional blood sample (14). The University of Arizona Institutional Review Board provided research approval to enroll participants (project no. 1811085933A011).

Variables
Coccidioidomycosis was our primary outcome of interest, which we defined as confirmatory evidence via positive Coccidioides serologic testing, such as ELISA, immunodiffusion, complement fixation titers >1:2, or a positive culture. We coded indeterminate ELISA and immunodiffusion results as negative. Demographic data collected included age, sex, race, ethnicity, and length of residence in an endemic area. Participants or their designated proxies reported previous symptoms and length of illness via survey. Laboratory measurements were leukocyte count and differential, hemoglobin, platelet count, serum albumin, and total serum protein. Participants provided an additional blood sample that was used to measure C-reactive protein (CRP), erythrocyte sedimentation rate (ESR), and procalcitonin (PCT) levels. A team of physicians conducted a review of each participant's chart to compile any history of immunocompromised status, such as type 2 diabetes, HIV/AIDS, or immunosuppressive therapies. We identified coccidioidomycosis clinical manifestations by using diagnostic notes and radiographic results.

Analysis
We stratified our analyses by inpatient versus outpatient admission status because of systematic differences in the complexity of clinical presentation and availability of electronic medical record data. We classified race as a binary White or non-White variable because of the low representation of minority racial groups. Continuous variables displayed nonnormal distributions. We used the nonparametric Mann-Whitney U test to evaluate the distribution of continuous variables across groups and Fisher exact test to evaluate categorical variables across groups.
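The Fisher exact test used for categorical comparisons can be illustrated with a minimal sketch. The function name and the 2x2 counts below are hypothetical and are not study data; the sketch assumes the common two-sided convention of summing the probabilities of all tables that are no more likely than the observed table, which might differ in detail from the software the authors used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(k):
        # Hypergeometric probability of k in the top-left cell,
        # holding all row and column totals fixed.
        return comb(row1, k) * comb(row2, col1 - k) / comb(total, col1)

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is no more probable than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(
        prob(k)
        for k in range(k_min, k_max + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts (not study data):
# rows = test positive / test negative, columns = symptom present / absent.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"two-sided p = {p_value:.4f}")
```

An exact test of this kind is typically preferred over a chi-square approximation when some cells are sparse, as is likely for infrequent symptoms in a stratified sample.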
Before model development, we evaluated potential predictor variables for multicollinearity by using variance inflation factors and correlation. We applied a correlation threshold of r>0.7 and identified eosinophil percentage as a collinear feature. We omitted eosinophil percentage from our analysis because we considered it to be less clinically relevant than eosinophil count (Appendix Figures 1, 2, https://wwwnc.cdc.gov/EID/article/28/6/21-2311-App1.pdf). All numeric variables exhibited nonnormal distributions and were log transformed. We included clinical features, participant symptoms, and age as binary variables within models, and incorporated length of residence, duration of illness, and laboratory markers as continuous measures. To reduce the loss of sample size, we imputed missing data by Gibbs sampling (16,17). We evaluated imputed data stability by replicating variable selection methods for 5 distinct completed datasets. In brief, we imputed numeric variables by using predictive mean matching, we imputed binary variables with logistic regression, and we imputed multiclass variables by using Bayesian polytomous regression. An average of 12 (3.1%) observations were missing from each variable; no more than 63 (16.1%) observations were missing for any single feature. Data with the highest number of missing observations were eosinophil count (16.1%), albumin (13.6%), and total protein (13.6%) (Appendix Table 1). We used imputation methods to retain a sufficient sample size for feature selection; listwise deletion resulted in a loss of 156 (40%) observations.
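The pairwise correlation screen described above can be sketched as follows. This is an illustration only: the variable names and values are invented, the helper functions are hypothetical, and the sketch covers the r>0.7 correlation rule but not the variance inflation factors or the log transformation.

```python
from itertools import combinations
from math import sqrt


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def collinear_pairs(columns, threshold=0.7):
    """Return (name1, name2, r) for every pair with |r| above the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson_r(a, b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Invented laboratory values for 6 hypothetical participants (not study data).
labs = {
    "eos_count": [0.1, 0.2, 0.4, 0.6, 0.9, 1.2],
    "eos_pct": [1.0, 2.5, 4.0, 7.0, 9.5, 13.0],
    "platelets": [250, 180, 310, 220, 270, 240],
}
flagged = collinear_pairs(labs)
print(flagged)
```

In this toy example only the eosinophil count and percentage pair exceeds the threshold, so one member of the pair would be dropped before model fitting.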

Variable Selection and Evaluation
First, we developed univariable logistic regression models, reporting all parameter estimates in terms of odds ratios (ORs) on imputed data. We constructed multivariable models by using the semi-automated least absolute shrinkage and selection operator (LASSO) method on imputed data (18). In brief, LASSO is a selection technique that uses penalization to shrink small regression coefficients to zero. Penalization (lambda) parameters can be selected by using the minimum cross-validated mean squared error (CVMSE) or the largest penalty for which the CVMSE is within 1 SD of the minimum. We used the mean of these 2 lambda values to penalize our models. We retained variables with nonzero coefficients in each model. We selected LASSO because of its ability to select influential features in a high-dimensional dataset (i.e., a high number of variables relative to the number of observations). Other regression methods often suffer degeneracies when the number of predictors exceeds or is close to the number of observations (19).
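The penalty selection rule described above can be sketched in a few lines. The cross-validation results, coefficient values, and function names below are hypothetical. The sketch follows the text in using a 1 SD band around the minimum error; the glmnet package labels its analogous quantity lambda.1se and bases it on the standard error of the cross-validated error, so treat the spread used here as an assumption. The soft-thresholding function shows how an L1 penalty sets small coefficients exactly to zero; it is the exact LASSO solution only in simplified settings (for example, orthonormal predictors) and is included as an illustration, not as the fitting procedure the authors ran.

```python
def soft_threshold(value, penalty):
    """Shrink a coefficient toward zero; values within the penalty become 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def select_lambdas(lambdas, cv_error, cv_spread):
    """Return (lambda at minimum error, largest lambda within the band, mean)."""
    best = min(range(len(lambdas)), key=lambda i: cv_error[i])
    limit = cv_error[best] + cv_spread[best]
    lambda_min = lambdas[best]
    lambda_band = max(
        lam for lam, err in zip(lambdas, cv_error) if err <= limit
    )
    return lambda_min, lambda_band, (lambda_min + lambda_band) / 2


# Hypothetical cross-validation curve (not study output).
lambdas = [0.20, 0.10, 0.05, 0.02, 0.01]
cv_error = [0.250, 0.215, 0.205, 0.200, 0.204]
cv_spread = [0.012, 0.012, 0.012, 0.012, 0.012]

lambda_min, lambda_band, lambda_used = select_lambdas(lambdas, cv_error, cv_spread)

# Hypothetical unpenalized coefficients; small ones are zeroed by the penalty.
coefficients = [0.90, -0.02, 0.03, -0.40]
shrunk = [soft_threshold(c, lambda_used) for c in coefficients]
print(lambda_min, lambda_band, lambda_used, shrunk)
```

Averaging the two candidate penalties yields a model that is sparser than the minimum-error choice but less aggressive than the most heavily penalized choice within the band.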
We performed leave-one-out cross-validation to calculate predictive performance of multivariable models and obtain corrected estimates of sensitivity, specificity, and predictive values. This internal validation method provides an out-of-sample performance estimate of each model. We used receiver operating characteristic (ROC) area under the curve (AUC) to evaluate predictive performance of our models. ROC AUC uses a combination of sensitivity and specificity to assess predictive performance. An ROC AUC of 1.0 corresponds to perfect discrimination, whereas 0.50 indicates no predictive ability. We performed sensitivity analyses by using standardized laboratory reference ranges (20-22) and among participants with and without identifiable immunocompromised conditions. We developed supplemental models to identify alternative laboratory thresholds predictive of coccidioidomycosis. We used R version 3.6.3 (23) to conduct analyses and performed multiple imputation by using the mice package (17). We conducted LASSO by using the glmnet package in R (24). We considered p<0.05 statistically significant with no correction for multiple testing. We report this study according to STrengthening the Reporting of OBservational studies in Epidemiology (STROBE) guidelines (https://www.strobe-statement.org) (Appendix Tables 1-9).
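The validation steps described above (leave-one-out cross-validation, ROC AUC, and threshold-based accuracy measures) can be sketched as follows. Everything in the example is hypothetical: the data are invented, and the toy "model" simply predicts the observed positivity rate for each value of a single binary feature in place of the penalized logistic regression used in the study. The AUC is computed as the proportion of positive-negative pairs in which the positive case receives the higher score, with ties counted as one half.

```python
def loocv_scores(x, y, fit, predict):
    """Score each observation with a model fitted to all other observations."""
    scores = []
    for i in range(len(y)):
        model = fit(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        scores.append(predict(model, x[i]))
    return scores


def roc_auc(scores, labels):
    """Probability that a random positive outscores a random negative."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def classification_metrics(scores, labels, cutoff):
    """Sensitivity, specificity, and predictive values at a score cutoff."""
    pairs = list(zip(scores, labels))
    tp = sum(s >= cutoff and lab == 1 for s, lab in pairs)
    fp = sum(s >= cutoff and lab == 0 for s, lab in pairs)
    fn = sum(s < cutoff and lab == 1 for s, lab in pairs)
    tn = sum(s < cutoff and lab == 0 for s, lab in pairs)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def fit(train_x, train_y):
    # Toy model: positivity rate within each level of one binary feature.
    rates = {}
    for value in set(train_x):
        outcomes = [yy for xx, yy in zip(train_x, train_y) if xx == value]
        rates[value] = sum(outcomes) / len(outcomes)
    return rates


def predict(model, value):
    return model[value]


# Invented data: x = feature present (1) or absent (0), y = test result.
x = [1, 1, 1, 0, 0, 0, 0, 0]
y = [1, 1, 0, 1, 0, 0, 0, 0]

scores = loocv_scores(x, y, fit, predict)
auc = roc_auc(scores, y)
metrics = classification_metrics(scores, y, cutoff=0.5)
print(scores, auc, metrics)
```

Because each score comes from a model that never saw the scored observation, the resulting AUC is generally lower than the apparent AUC computed on the training data, which is the correction for optimism that internal validation is meant to provide.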

Results
Our initial sample consisted of 392 participants with suspected coccidioidomycosis (Figure). Participants were stratified into outpatient (n = 99) and inpatient groups (n = 293). The outpatient group consisted of 35 coccidioidomycosis-positive participants and 64 coccidioidomycosis-negative participants; our inpatient group consisted of 38 coccidioidomycosis-positive participants and 255 coccidioidomycosis-negative participants (Table 2). The median age for outpatients was 57 years for coccidioidomycosis-positive and 51 years for coccidioidomycosis-negative participants, but we noted no statistical difference in age (p = 0.53) (Table 2).
†Immunocompromised status was identified as a participant with a weakened immune system at the time of coccidioidomycosis diagnosis, which included participants with type 2 diabetes, HIV/AIDS, lupus, rheumatoid arthritis, or leukemia, and organ transplant recipients and those receiving chemotherapy agents, corticosteroids, and biologic response modifiers.
‡Symptom counts represent the total number of patients reporting the condition.
We developed univariable and multivariable LASSO prediction models for coccidioidomycosis stratified by admission status. Within our outpatient univariable models, positivity was significantly associated with rash (p = 0.006), higher eosinophil count (p = 0.012), and a lower PCT concentration (p = 0.039) (Table 3). Univariable models suggested eosinophilia (>0.50 × 10³/µL) is predictive of coccidioidomycosis (Appendix Table 3). Our inpatient univariable models identified higher eosinophil count, higher serum protein, lower age, lower CRP concentration, non-White racial identification, and rash as predictors of coccidioidomycosis, but muscle aches and immunocompromised status were negatively associated with disease (Table 4).
Selected features for our outpatient multivariable model included rash, shortness of breath, PCT, platelet count, and eosinophil count (Table 3); however, only rash was significantly associated with a coccidioidomycosis-positive test result (adjusted OR [aOR] 9.74, 95% CI 1.03-92.24). Outpatient multivariable models did not identify eosinophil count at any level as a predictive marker (Appendix Table 4).
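The width of the confidence interval for rash can be related to its p value with a short calculation. The sketch below assumes a Wald-type interval that is symmetric on the log-odds scale and a normal approximation for the test statistic; the authors' software might have used a different method, so the result is a consistency check and not a reproduction of their analysis. The function name is hypothetical.

```python
from math import erfc, log, sqrt


def wald_summary(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover the log-odds standard error, z statistic, and two-sided p value
    from an odds ratio and its 95% CI, assuming a Wald interval."""
    standard_error = (log(ci_high) - log(ci_low)) / (2 * z_crit)
    z = log(odds_ratio) / standard_error
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    p_two_sided = erfc(abs(z) / sqrt(2))
    return standard_error, z, p_two_sided


# Values reported for rash in the outpatient multivariable model.
se, z, p = wald_summary(9.74, 1.03, 92.24)
print(f"SE(log OR) = {se:.3f}, z = {z:.3f}, p = {p:.3f}")
```

Under these assumptions the recovered p value is close to the p = 0.047 reported in the abstract, and the large standard error on the log-odds scale reflects how few participants contributed to the estimate, which is why the interval spans nearly 2 orders of magnitude.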
Features selected in multivariable models were identical in replicated imputation datasets, suggesting consistency in variable selection. Sensitivity analyses performed after removing immunocompromised outpatient participants similarly identified rash and elevated eosinophil count as predictors of coccidioidomycosis positivity (Appendix Table 5). Specificity in our immunocompetent outpatient model was lower (24.0%) than for the full model (69.5%), but sensitivity modestly improved (86.5%) in contrast to the full outpatient model (72.7%) (Appendix Table 6). After removing immunocompromised participants from our inpatient population, we identified no predictive features in either univariable or multivariable models. Univariable models using clinical breakpoints for laboratory measures were directionally consistent with our main results. Outpatient univariable models identified procalcitonin and eosinophil count as major predictors; however, no laboratory predictors were identified in inpatient models after using standardized reference ranges (Appendix Table 3). Multivariable modeling without stratification by admission status identified a feature set similar to that of our inpatient model because of the large sample size of this group relative to our outpatient population (Appendix Table 7). No predictors were identified for either inpatient or outpatient groups using an immunocompromised-only population. Variables identified in models including participants with acute pulmonary symptoms were directionally consistent with our main findings (Appendix Tables 8, 9).

Discussion
We found preliminary evidence for several markers that could predict coccidioidomycosis based on admission status. Although <40% of outpatient and inpatient participants had rash, our results suggest that rash might support coccidioidomycosis identification better than other symptoms, such as shortness of breath and muscle aches. In outpatient settings, PCT might help differentiate between a bacterial and Coccidioides infection. However, in inpatient settings, conventional indicators, including CRP level and immunocompromised status, might be obscured by the comorbidities and high inflammatory markers typical of admitted patients, which would reduce their value as predictive risk factors. Our models suggest elevated eosinophil count could be a viable biomarker to signal coccidioidomycosis in either clinical setting.
Both our univariable analyses and multivariable models among outpatients indicated rash as a major predictor of coccidioidomycosis. Our results were likely driven by the low incidence of rash among coccidioidomycosis-negative participants (4.7%) compared with coccidioidomycosis-positive participants (45.7%). This finding might emphasize the utility of rash as a unique marker of coccidioidomycosis, considering the comparatively low occurrence of this symptom in the outpatient population. Our findings are consistent with previous studies suggesting rash is more frequently identified among coccidioidomycosis cases than among cases of other common respiratory infections (25). PCT was negatively associated with positive status, but elevated eosinophil count was a predictive marker of coccidioidomycosis. Laboratory markers were not predictive in our multivariable model; however, low serum PCT levels previously have been reported in persons with coccidioidomycosis (26). Lower PCT is consistent with the cell-mediated immune response against Coccidioides infection because the production of interferon gamma from type-1 T-helper cells impedes PCT upregulation (27). Previous studies also have indicated elevated eosinophil counts among persons with coccidioidomycosis (28,29). Our results substantiate previous recommendations that eosinophilia heightens suspicion of Coccidioides infection (30).
Our univariable analyses for inpatients identified negative associations with age, muscle aches, immunocompromised status, and CRP with coccidioidomycosis positivity, but non-White racial status, rash, eosinophil count, and total protein were positive predictive markers of disease. Our multivariable model selected an identical feature set, but only lower incidence of muscle aches and a higher eosinophil count remained statistically significant.
Some of our null findings could be explained by the high concentration of immunocompromised participants in the inpatient setting, because these patients often have established coccidioidomycosis risk factors at admission. Furthermore, previous evidence suggests that 20%-50% of specimens from immunocompromised persons test false-negative by Coccidioides serologies (31); thus, false-negative test results among coccidioidomycosis-negative participants might have been artificially inflated in our study. Of note, older age is a well-established coccidioidomycosis risk factor because of the decline in immune function and higher prevalence of chronic diseases among older persons (32); substantial evidence also suggests that immunocompromised persons are more susceptible (33). Therefore, the predictive capacity of these risk factors might be limited by the intersecting clinical patterns of coccidioidomycosis and other diseases in the inpatient setting. We also identified CRP as a negative predictor for coccidioidomycosis. As a generalized blood test marker, CRP might have detected higher inflammation for other conditions among inpatients. As in our outpatient results, LASSO selection incorporated eosinophil count into our multivariable model, indicating that eosinophil levels >0.20 × 10³/µL might be predictive of Coccidioides infection in inpatient settings. Inpatient models did not identify PCT as a negative predictor of coccidioidomycosis.
Our results differ from risk factors previously identified by Yozwiak et al. (13), who developed a model using healthy college-aged students. Although these previously identified risk factors might have practical value for estimating relative risk in a healthy population, the inconsistent feature set with our study suggests previous results have limited transferability to a more diverse clinical population. For example, Yozwiak et al. reported male sex, shorter length of residence in coccidioidomycosis-endemic areas, and shorter duration of symptoms as independent risk factors for coccidioidomycosis, which we did not detect as predictors of disease in our study. Yozwiak et al. further reported that higher ESR values and lower lymphocyte levels were associated with disease. Although eosinophil count was indicative of coccidioidomycosis in our study, we did not identify ESR or other cell types as statistically significant predictive markers.
We describe novel coccidioidomycosis prediction models for inpatient and outpatient clinical settings using an agnostic feature selection technique. We constructed models by using data from our previous cross-sectional study and leveraged these data to substantiate risk factors previously associated with coccidioidomycosis, including clinical, demographic, and laboratory variables. We identified markers that might identify coccidioidomycosis before diagnostic testing and distinctive predictive features based on admission status. We stratified models by inpatient and outpatient groups because of the unique features identified within each clinical setting. Our study identified several clinical features in outpatient and inpatient settings, but screening for Coccidioides in endemic settings remains invaluable. Although negative clinical features, such as PCT, muscle aches, or shortness of breath, might be indicative of an alternative diagnosis, we emphasize that the presence of these markers should not deter testing.
Limitations of our study include a reduced sample size used to develop our models, in part due to our stratification, which might have hampered our ability to accurately estimate predictive markers of coccidioidomycosis. We were further unable to apply clinical breakpoints for laboratory measures because of reduced granularity of binary measures and therefore report the effect of continuous variables. We attempted to minimize feature selection biases by using LASSO to construct models; LASSO offers several benefits over alternative feature selection methods, but our impartial approach might have inappropriately eliminated collinear or other necessary control variables. We additionally recognize that participants with established coccidioidomycosis markers might have been preferentially tested during enrollment, resulting in selection bias and influencing marker selection. The relative infrequency of identified features further hinders the clinical utility of leveraging these markers to identify coccidioidomycosis and emphasizes the importance of diagnostic testing.
Our study's strengths include our use of a novel multidimensional dataset to evaluate established and suspected coccidioidomycosis risk factors. Stratification reveals substructures within clinical settings that could improve disease identification and diagnosis. Our sensitivity analyses using immunocompetent patients further increase confidence in selected features, because rash and higher eosinophil count were similarly predictive of coccidioidomycosis in the outpatient setting.
Public health recommendations are to test for Coccidioides among patients with pneumonia-like symptoms in endemic areas. However, the complex and often nonspecific clinical manifestations of coccidioidomycosis indicate a need to improve disease identification. Since the emergence of coronavirus disease in 2019, differentiating between coccidioidomycosis and other pneumonias remains vital for the rapid diagnosis and treatment of disease. The limited accuracy of our models, however, indicates the need for a more robust data source for model development. Replication in a larger clinical study incorporating other endemic regions could provide insight into additional predictive markers for more specific clinical manifestations. Our study identifies surrogate markers in a clinical setting that might provide a developmental framework for future predictive models.
In conclusion, we developed prediction models for multiple clinical settings to support identification of coccidioidomycosis before diagnostic testing. Prediction models could guide the clinical decision-making process to test for coccidioidomycosis, expedite identification of more serious disease complications, and decrease the use of unnecessary diagnostic tests or antimicrobial agents.